Out in the Open: Finding and Categorising Errors in the Lexical Simplification Pipeline

Author

  • Matthew Shardlow
Abstract

Lexical simplification is the task of automatically reducing the complexity of a text by identifying difficult words and replacing them with simpler alternatives. Whilst this is a valuable application of natural language generation, rudimentary lexical simplification systems suffer from a high error rate which often results in nonsensical, non-simple text. This paper seeks to characterise and quantify the errors which occur in a typical baseline lexical simplification system. We expose 6 distinct categories of error and propose a classification scheme for these. We also quantify these errors for a moderate size corpus, showing the magnitude of each error type. We find that for 183 identified simplification instances, only 19 (10.38%) result in a valid simplification, with the rest causing errors of varying gravity.
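
As a rough illustration of the pipeline the abstract describes (identify difficult words, then replace them with simpler alternatives), here is a minimal Python sketch. It is not the system evaluated in the paper: the frequency table, the synonym dictionary, the threshold rule, and the names `is_complex` and `simplify` are all hypothetical and chosen only for illustration. The last line simply rechecks the percentage reported in the abstract.

```python
# Toy resources: both tables are hypothetical stand-ins, not data from the paper.
WORD_FREQ = {
    "the": 5000, "use": 900, "help": 800, "tools": 300,
    "learning": 250, "facilitate": 20, "utilise": 12,
}
SYNONYMS = {
    "utilise": ["use", "employ"],
    "facilitate": ["help", "ease"],
}


def is_complex(word, threshold=50):
    """Flag a word as complex when its assumed frequency is below the threshold."""
    return WORD_FREQ.get(word.lower(), 0) < threshold


def simplify(sentence, threshold=50):
    """Replace each complex word with its most frequent listed synonym, if any."""
    output = []
    for token in sentence.split():
        candidates = SYNONYMS.get(token.lower(), [])
        if is_complex(token, threshold) and candidates:
            best = max(candidates, key=lambda w: WORD_FREQ.get(w, 0))
            # Substitute only when the candidate is more frequent than the original.
            if WORD_FREQ.get(best, 0) > WORD_FREQ.get(token.lower(), 0):
                token = best
        output.append(token)
    return " ".join(output)


print(simplify("the tools facilitate learning"))

# Share of valid simplifications reported in the abstract: 19 of 183 instances.
valid_rate = round(19 / 183 * 100, 2)
print(valid_rate)
```

The sketch ignores context, word sense, and inflection, so a substitution chosen this way can easily be ungrammatical or change the meaning of the sentence.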

Similar resources

The Effect of Reducing Lexical and Syntactic Complexity of Texts on Reading Comprehension

The present study investigated the effect of different types of text simplification (i.e., reducing the lexical and syntactic complexity of texts) on reading comprehension of English as a Foreign Language learners (EFL). Sixty female intermediate EFL learners from three intact classes in Tabarestan Language Institute in Tehran participated in the study. The intact classes were assigned to three...


Exploring the Role of Occurring Errors Distribution in the Distribution of Corrective Feedback Targets

This study attempted to compare corrected linguistic errors in foreign language classrooms and all errors occurring in these classes to see which types of errors are more attended to by teachers in relation to their occurrence in classes. For this purpose, 69 hours of the classes of 34 teachers teaching in different language schools were recorded and the errors corrected by these teachers were ...


Benchmarking Lexical Simplification Systems

Lexical Simplification is the task of replacing complex words in a text with simpler alternatives. A variety of strategies have been devised for this challenge, yet there has been little effort in comparing their performance. In this contribution, we present a benchmarking of several Lexical Simplification systems. By combining resources created in previous work with automatic spelling and infl...


A Survey of Automated Text Simplification

Text simplification modifies syntax and lexicon to improve the understandability of language for an end user. This survey identifies and classifies simplification research within the period 1998-2013. Simplification can be used for many applications, including: Second language learners, preprocessing in pipelines and assistive technology. There are many approaches to the simplification task, in...

Publication date: 2014